Assignment #1 – Image Synthesis Project 1

Colorizing the Prokudin-Gorskii Photo Collection

Overview

In this project, we work with a collection of greyscale images by the famous photographer Sergei Mikhailovich Prokudin-Gorskii. Each image contains three channel exposures (RGB), which might have undergone transformations such as translation or rotation relative to one another. The goal of the assignment is to overlay the three channel images in a way that minimizes artifacts and produces a color image.

Approach

We applied several image processing techniques to compose the color image. First, we started with template matching: an exhaustive search over a [-15, 15] pixel grid of translations that minimizes the Sum of Squared Differences (SSD) or maximizes the Normalized Cross-Correlation (NCC). The translation with the lowest SSD or highest NCC was stored as the best offset for image alignment. Second, since a few of the images were high-resolution, the brute-force solution would be expensive in resources; hence we implemented a faster search procedure using an image pyramid, which downsamples and processes the images and then upsamples through the corresponding layers to obtain the final offset. Finally, we enhanced our aligned results using various image processing techniques such as auto-cropping, auto-contrast, auto white balance, better feature mapping, alignment, and processing data from other sources.

Explanation

Before applying any of the image processing methods, we pre-processed each image by splitting it into its three channels. We assumed that the channels divide the image into equal parts. Also, since the channels may already have borders, we shave off 3% of each dimension to obtain a cleaner image. After preprocessing the individual channels, we computed SSD and NCC (see the formulas below) over a moving [-15, 15] pixel window. We had a base image (green channel) and two reference images (red and blue channels). The function tracked the optimal value of each metric and stored the best translation offset for each reference image with respect to the base image. Once we had obtained the offsets, we translated the reference images and aligned them with the base image. In almost all the images, we found NCC to be more robust than SSD. For large images we used the image pyramid method, wherein we downsampled the image by a factor of 2 and then upsampled it after processing.
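The pre-processing step can be sketched as follows. This is a minimal illustration, not our exact code: it assumes the three channels are stacked vertically in the scanned plate and that the 3% is removed from every side, and the helper names `split_channels` and `shave` are hypothetical.

```python
import numpy as np

def split_channels(plate):
    """Split a stacked plate into three equal-height channel images.

    Assumption: the channels are stacked top to bottom.
    """
    h = plate.shape[0] // 3
    return [plate[i * h:(i + 1) * h] for i in range(3)]

def shave(channel, frac=0.03):
    """Remove `frac` of each dimension from every side (assumed per side)."""
    dy = int(round(channel.shape[0] * frac))
    dx = int(round(channel.shape[1] * frac))
    return channel[dy:channel.shape[0] - dy, dx:channel.shape[1] - dx]
```

For a 300 x 100 plate this yields three 100 x 100 channels, each shaved to 94 x 94.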

\( \mathrm{NCC}(u, v) = \frac{\sum_{(x',y')} \left[ f(x'+u, y'+v) - \overline{f} \right] \left[ g(x', y') - \overline{g} \right]}{\sqrt{\sum_{(x',y')} \left[ f(x'+u, y'+v) - \overline{f} \right]^{2}} \sqrt{\sum_{(x',y')} \left[ g(x', y') - \overline{g} \right]^{2}}} \)

\( \mathrm{SSD}(u, v) = \sum_{(x',y')} \left[ f(x'+u, y'+v) - g(x', y') \right]^{2} \)
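The two scores and the searches described above can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not our exact implementation: shifts are applied with `np.roll` (which wraps around at the borders), the pyramid downsamples by plain striding without blurring, and the names `best_offset` and `pyramid_offset` as well as the `min_size` and refinement radius values are hypothetical.

```python
import numpy as np

def ssd(f, g):
    """Sum of squared differences (lower is better)."""
    return float(np.sum((f - g) ** 2))

def ncc(f, g):
    """Normalized cross-correlation of mean-subtracted images (higher is better)."""
    f0, g0 = f - f.mean(), g - g.mean()
    denom = np.sqrt(np.sum(f0 ** 2)) * np.sqrt(np.sum(g0 ** 2))
    return float(np.sum(f0 * g0) / denom) if denom > 0 else 0.0

def best_offset(base, ref, metric="ncc", center=(0, 0), radius=15):
    """Try every shift within `radius` of `center`; return the best (u, v).

    The returned offset is the shift to apply to `ref` with np.roll.
    """
    best_score, best_uv = -np.inf, center
    for u in range(center[0] - radius, center[0] + radius + 1):
        for v in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(ref, (u, v), axis=(0, 1))  # wraps at the borders
            # Negate SSD so that "larger is better" holds for both metrics.
            score = ncc(shifted, base) if metric == "ncc" else -ssd(shifted, base)
            if score > best_score:
                best_score, best_uv = score, (u, v)
    return best_uv

def pyramid_offset(base, ref, metric="ncc", min_size=64, radius=15):
    """Coarse-to-fine search: solve at half resolution, double, then refine."""
    if min(base.shape) <= min_size:
        return best_offset(base, ref, metric, radius=radius)
    # Assumption: downsample by taking every second pixel (no smoothing).
    cu, cv = pyramid_offset(base[::2, ::2], ref[::2, ::2], metric, min_size, radius)
    return best_offset(base, ref, metric, center=(2 * cu, 2 * cv), radius=2)
```

Because each pyramid level only refines within a couple of pixels of the doubled coarse estimate, the final offset can be much larger than 15 pixels while the per-level search stays small.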

Results

[Image panels: NCC, SSD, Raw]

NCC Loss Offset for Images

Image Name (Channel)	Offset X	Offset Y
cathedral.jpeg (G)	5	2
cathedral.jpeg (R)	12	3
emir.tiff (G)	49	24
emir.tiff (R)	104	42
harvesters.tiff (G)	61	16
harvesters.tiff (R)	124	13
icon.tiff (G)	41	17
icon.tiff (R)	89	23
lady.tiff (G)	59	-5
lady.tiff (R)	119	-10
self_portrait.tiff (G)	82	-2
self_portrait.tiff (R)	127	-8
three_generations.tiff (G)	55	12
three_generations.tiff (R)	112	10
train.tiff (G)	44	2
train.tiff (R)	88	30
turkmen.tiff (G)	56	18
turkmen.tiff (R)	115	26
village.tiff (G)	66	13
village.tiff (R)	127	24

SSD Loss Offset for Images

Image Name (Channel)	Offset X	Offset Y
cathedral.jpeg (G)	5	2
cathedral.jpeg (R)	12	3
emir.tiff (G)	49	24
emir.tiff (R)	104	42
harvesters.tiff (G)	71	39
harvesters.tiff (R)	124	13
icon.tiff (G)	41	17
icon.tiff (R)	89	23
lady.tiff (G)	59	-5
lady.tiff (R)	117	-10
self_portrait.tiff (G)	82	-2
self_portrait.tiff (R)	127	-8
three_generations.tiff (G)	55	12
three_generations.tiff (R)	112	10
train.tiff (G)	44	2
train.tiff (R)	88	30
turkmen.tiff (G)	56	18
turkmen.tiff (R)	115	26
village.tiff (G)	66	13
village.tiff (R)	127	24

Bells & Whistles

Auto-Crop

For auto-cropping, we extracted the top-left and bottom-right portions of the aligned color image. We observed that misalignment leaves a black border around the entire image. We then applied a Sobel filter to find the edges of that border, which gave us the offsets by which to crop the image. Finally, we applied image transformation techniques to obtain the final image.
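A rough sketch of the border detection is given below. It is not our exact code: the Sobel response is computed with NumPy slicing rather than a library call, the search is limited to an outer strip on each side (the `frac` parameter is an assumed value), and the crop is placed one pixel inside the strongest edge, so it may trim a row or column of real content.

```python
import numpy as np

def sobel_magnitude(gray):
    """Sobel gradient magnitude, computed with slicing on an edge-padded image."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

def auto_crop(img, frac=0.1):
    """Crop at the strongest edge found in the outer `frac` strip of each side."""
    gray = img.mean(axis=2) if img.ndim == 3 else img
    mag = sobel_magnitude(gray)
    h, w = gray.shape
    sh, sw = max(int(h * frac), 2), max(int(w * frac), 2)
    rows, cols = mag.mean(axis=1), mag.mean(axis=0)  # edge strength profiles
    top = int(np.argmax(rows[:sh]))
    left = int(np.argmax(cols[:sw]))
    bottom = h - sh + int(np.argmax(rows[h - sh:]))
    right = w - sw + int(np.argmax(cols[w - sw:]))
    # Step one pixel inside the detected edge on the top and left sides.
    return img[top + 1:bottom, left + 1:right]
```

On textured photographs the strongest edge inside a strip is not guaranteed to be the border, so the strip width matters.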

RESULTS

[Image panels: NCC, Cropped]

Auto-WB Adjusted

To obtain the white-balance-adjusted image, we worked on the cropped image from above to minimize image artifacts. We then computed the mean of each channel and normalized the values across channels. The final image had a more balanced histogram distribution across channels.
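One common way to realize this kind of per-channel mean normalization is the gray-world rule, sketched below. Treating our normalization as gray-world is an assumption made for illustration, and the function name `gray_world` is hypothetical.

```python
import numpy as np

def gray_world(img):
    """Scale each channel so that all channel means match their common average."""
    img = img.astype(np.float64)
    means = img.reshape(-1, 3).mean(axis=0)          # one mean per channel
    gains = means.mean() / np.maximum(means, 1e-8)   # avoid division by zero
    return np.clip(np.rint(img * gains), 0, 255).astype(np.uint8)
```

A channel that is brighter than average on the whole is scaled down and a darker one is scaled up, which removes a global color cast.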

RESULTS

[Image panels: Cropped, WB Adjusted]

Auto-Contrast

For auto-contrast, we worked on the cropped image. We first calculated the grayscale histogram of the transformed image. We then located the clip points at the 1% levels on the left and right ends of the histogram. Finally, we used cv2.convertScaleAbs to obtain the final contrast-adjusted image.
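The histogram clipping can be sketched as follows. To keep the example dependency-free, NumPy scaling and clipping stand in for cv2.convertScaleAbs here, so this is an approximation of the step rather than our exact code; the function name `auto_contrast` is hypothetical.

```python
import numpy as np

def auto_contrast(img, clip_percent=1.0):
    """Stretch intensities linearly after clipping `clip_percent` at both ends."""
    gray = img.mean(axis=2) if img.ndim == 3 else img
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    cdf = np.cumsum(hist) / hist.sum()
    # First grey levels at which the cumulative histogram passes 1% and 99%.
    lo = int(np.searchsorted(cdf, clip_percent / 100.0))
    hi = int(np.searchsorted(cdf, 1.0 - clip_percent / 100.0))
    if hi <= lo:
        return img.copy()  # flat image: nothing to stretch
    alpha = 255.0 / (hi - lo)  # gain
    beta = -lo * alpha         # bias
    out = img.astype(np.float64) * alpha + beta
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

The gain `alpha` and bias `beta` play the same role as the `alpha` and `beta` arguments of a linear scaling call.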

RESULTS

[Image panels: Cropped, Contrast Adjusted]

Feature-Based Alignment

The technique we used is often called “feature-based” image alignment because a sparse set of features is detected in one image and matched with the features in the other image. A transformation is then calculated from these matched features that warps one image onto the other. We split the image into three parts and then registered each channel with respect to a reference channel. Finally, all the channels were concatenated into one RGB color image.
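The transformation-estimation step can be illustrated in isolation. The sketch below assumes the matched keypoint coordinates are already available (feature detection and matching are outside its scope) and fits an affine transform by plain least squares. The affine model and the name `estimate_affine` are assumptions made for illustration; unlike a robust estimator, this fit has no outlier rejection.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine matrix A with dst ~= A @ [x, y, 1]."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    X = np.hstack([src, np.ones((len(src), 1))])   # (N, 3) homogeneous points
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)  # (3, 2) solution
    return sol.T                                   # (2, 3) affine matrix
```

At least three non-collinear matches are needed for the fit to be well determined; with more matches the result averages out small localization errors.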

RESULTS

[Image panels: Raw, Feature-based alignment]

4D Search

In this method, we searched a grid of transformations over translation (X and Y), scaling (0.5x to 2x in increments of 0.5), and rotation (-1.5 to 2 degrees in increments of 0.5 degrees). While the complexity increased significantly, we saw minor enhancements in the aligned image. As with the images above, we then applied auto_crop to remove the unwanted borders after aligning all the channels.
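The grid search over the four parameters can be sketched as below. This is an illustration under assumptions, not our exact code: the warp rotates and scales about the image center with nearest-neighbor sampling, translation uses `np.roll`, NCC is the score, and the small default shift range is chosen only to keep the example fast. The names `warp` and `search_4d` are hypothetical.

```python
import numpy as np

def ncc(f, g):
    """Normalized cross-correlation of mean-subtracted images."""
    f0, g0 = f - f.mean(), g - g.mean()
    denom = np.sqrt(np.sum(f0 ** 2)) * np.sqrt(np.sum(g0 ** 2))
    return float(np.sum(f0 * g0) / denom) if denom > 0 else 0.0

def warp(img, scale, angle_deg):
    """Rotate and scale about the image center (nearest-neighbor sampling)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # Inverse mapping: for each output pixel, locate its source pixel.
    y0, x0 = (ys - cy) / scale, (xs - cx) / scale
    sy = np.cos(t) * y0 - np.sin(t) * x0 + cy
    sx = np.sin(t) * y0 + np.cos(t) * x0 + cx
    sy = np.clip(np.rint(sy), 0, h - 1).astype(int)
    sx = np.clip(np.rint(sx), 0, w - 1).astype(int)
    return img[sy, sx]

def search_4d(base, ref, shifts=range(-2, 3),
              scales=(0.5, 1.0, 1.5, 2.0),
              angles=np.arange(-1.5, 2.01, 0.5)):
    """Return the (dy, dx, scale, angle) with the highest NCC against `base`."""
    best_score, best_params = -np.inf, None
    for s in scales:
        for a in angles:
            warped = warp(ref, s, a)  # warp once per (scale, angle) pair
            for dy in shifts:
                for dx in shifts:
                    shifted = np.roll(warped, (dy, dx), axis=(0, 1))
                    score = ncc(base, shifted)
                    if score > best_score:
                        best_score, best_params = score, (dy, dx, s, float(a))
    return best_params
```

The cost is the product of the four grid sizes, which is why the complexity grows so quickly compared with the translation-only search. On very small images, sub-degree rotations can round to the identity under nearest-neighbor sampling, so several angles may tie.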

RESULTS

[Image panels: Raw, 4D Search]

Interpolation

In this technique, we used interpolation methods to align channels. We tried various methods but found nearest-neighbor and bilinear interpolation to be the most effective. There were minor enhancements in image quality for large images. We used PyTorch to implement this technique.
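The two resampling modes named above can be illustrated with a small resize routine. The sketch is written in plain NumPy rather than PyTorch so that it is self-contained, and it only shows how the two modes sample an image; how the resampled channels fed into alignment is not covered. The function name `resize` and the pixel-center convention are assumptions.

```python
import numpy as np

def resize(img, out_h, out_w, mode="bilinear"):
    """Resize a 2-D image with nearest-neighbor or bilinear interpolation."""
    h, w = img.shape
    # Map output pixel centers back to source coordinates.
    ys = (np.arange(out_h) + 0.5) * h / out_h - 0.5
    xs = (np.arange(out_w) + 0.5) * w / out_w - 0.5
    if mode == "nearest":
        yi = np.clip(np.rint(ys), 0, h - 1).astype(int)
        xi = np.clip(np.rint(xs), 0, w - 1).astype(int)
        return img[np.ix_(yi, xi)]
    ys, xs = np.clip(ys, 0, h - 1), np.clip(xs, 0, w - 1)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]  # fractional weights
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```

Nearest-neighbor copies existing pixel values and keeps edges hard, while bilinear blends the four surrounding pixels and gives smoother results at the cost of slight blurring.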

RESULTS

[Image panels: Raw, Interpolation]
